Real-valued All-Dimensions Search: Low-overhead Rapid Searching over Subsets of Attributes
Abstract
This paper is about searching the combinatorial space of contingency tables during the inner loop of a nonlinear statistical optimization. Examples of this operation in various data analytic communities include searching for nonlinear combinations of attributes that contribute significantly to a regression (Statistics), searching for items to include in a decision list (Machine Learning) and association rule hunting (Data Mining). This paper investigates a new, efficient approach to this class of problems, called RADSEARCH (Real-valued All-Dimensions-tree Search). RADSEARCH finds the global optimum, and this gives us the opportunity to empirically evaluate the question: apart from algorithmic elegance, what does this attention to optimality buy us? We compare RADSEARCH with other recent successful search algorithms such as CN2, PRIM, APriori, OPUS and DenseMiner. Finally, we introduce RADREG, a new regression algorithm for learning real-valued outputs based on RADSEARCHing for high-order interactions.
Similar resources
FUZZY HYPERVECTOR SPACES OVER VALUED FIELDS
In this note we first redefine the notion of a fuzzy hypervector space (see [1]) and then introduce some further concepts of fuzzy hypervector spaces, such as fuzzy convex and balanced fuzzy subsets in fuzzy hypervector spaces over valued fields. Finally, we briefly discuss the convex (balanced) hull of a given fuzzy set of a hypervector space.
Detecting Anomalous Groups in Categorical Datasets
We propose a new method for detecting groups of anomalies in categorical datasets. Our approach is a generalization of the spatial scan statistic, a commonly used method for detecting clusters of increased counts in spatial data. We extend this framework to non-spatial datasets with discrete valued attributes, where the degree of anomalousness of each record depends on its attribute values and ...
An Evolutionary Algorithm Integrating Discretization of Continuous-valued Attributes with Learning Decision Rules
A new method of learning decision rules from databases, which uses an evolutionary algorithm, is proposed. The main difference between our approach and the others described in the literature is the way continuous-valued attributes are processed. Most decision rule learners process these attributes separately when searching for threshold values, which may decrease the performance. In contrast ...
Real-Valued Schemata Search Using Statistical Confidence
Neural Network & Machine Learning Laboratory, Computer Science Department, Brigham Young University, Provo, UT 84602, USA. WWW: http://axon.cs.byu.edu. Abstract. Many neural network models must be trained by finding a set of real-valued weights that yield high accuracy on a training set. Other learning models require weights on input attributes that yield high leave-one...
An Efficient Scheme for Real-time Information Storage and Retrieval Systems: A Hybrid Approach
Information storage and retrieval is a fundamental requirement for many real-time applications. These systems demand that data be kept sorted at all times, that real-time insertion, deletion and searching be supported, and that dynamic entries be supported. These systems require search operations to be performed on massive databases implemented by various data structures. The common da...
Publication date: 2002